[v3] Hierarchy api #1912
base: main
…nto hierarchy_api
@classmethod
def from_dict(
async def from_dict(
This is a notable change that was needed to get the hierarchy API to work. Previously, `from_dict` was not async, but it should be.
return Array.from_dict(store_path=store_path, data=self.to_dict())

@classmethod
def from_array(
This is a convenience method that makes it less painful to create `ArrayModel` instances, because it uses as many defaults / inferred values as possible.
codecs: Iterable[Codec | JSON],
attributes: None | dict[str, JSON],
dimension_names: None | Iterable[str],
codecs: Iterable[Codec | JSON] = (BytesCodec(),),
I added some default values here to make array creation easier. Happy to revert if this is controversial.
* Run sphinx directly on readthedocs
* Update doc build script
Bumps the actions group with 6 updates:

| Package | From | To |
| --- | --- | --- |
| [actions/checkout](https://github.com/actions/checkout) | `3` | `4` |
| [github/codeql-action](https://github.com/github/codeql-action) | `2` | `3` |
| [actions/setup-python](https://github.com/actions/setup-python) | `4` | `5` |
| [actions/upload-artifact](https://github.com/actions/upload-artifact) | `3` | `4` |
| [actions/download-artifact](https://github.com/actions/download-artifact) | `3` | `4` |
| [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) | `1.8.10` | `1.8.14` |

Updates `actions/checkout` from 3 to 4
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v3...v4)

Updates `github/codeql-action` from 2 to 3
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v2...v3)

Updates `actions/setup-python` from 4 to 5
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v5)

Updates `actions/upload-artifact` from 3 to 4
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v3...v4)

Updates `actions/download-artifact` from 3 to 4
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](actions/download-artifact@v3...v4)

Updates `pypa/gh-action-pypi-publish` from 1.8.10 to 1.8.14
- [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
- [Commits](pypa/gh-action-pypi-publish@v1.8.10...v1.8.14)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: actions/download-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: pypa/gh-action-pypi-publish
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joe Hamman <[email protected]>
* Apply ruff rule RUF022
  RUF022 `__all__` is not sorted
* Apply ruff rule RUF029
  RUF029 Function is declared `async`, but doesn't `await` or use `async` features.

RUF009 Do not perform function call `cast` in dataclass defaults

* feature: group and array path/name/basename properties
* tests

* implement .chunks on v3 arrays
* remove noqa: B009
* make mypy happy
* only return chunks for regular chunk grids
---------
Co-authored-by: Davis Bennett <[email protected]>
Co-authored-by: Joseph Hamman <[email protected]>

updates:
- [github.com/astral-sh/ruff-pre-commit: v0.4.5 → v0.4.7](astral-sh/ruff-pre-commit@v0.4.5...v0.4.7)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
This PR adds a declarative API for defining Zarr arrays and groups independently of storage. Using this API, users and developers can create and manipulate Zarr hierarchies, adding nodes and modifying their attributes, and serialize the hierarchy to storage with a single method call.
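As a rough illustration of the idea (not the code in this PR), the sketch below models a small hierarchy with plain dataclasses, flattens it to a dict keyed by path, and writes one metadata document per node to an in-memory dict using `asyncio.gather`. The names `ArrayNode`, `GroupNode`, `flatten`, `to_document`, `write_document` and `save` are hypothetical, the documents are simplified rather than complete Zarr v3 metadata, and a plain `dict` stands in for a real store.

```python
import asyncio
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ArrayNode:
    """Hypothetical stand-in for an array model: metadata only, no chunks."""
    shape: tuple[int, ...]
    dtype: str
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class GroupNode:
    """Hypothetical stand-in for a group model: attributes plus child models."""
    attributes: dict = field(default_factory=dict)
    members: dict = field(default_factory=dict)  # name -> ArrayNode | GroupNode


def flatten(node, path=""):
    """Return a flat {path: node} dict; groups are stored without members."""
    if isinstance(node, ArrayNode):
        return {path: node}
    flat = {path: GroupNode(attributes=node.attributes)}
    for name, child in node.members.items():
        flat.update(flatten(child, f"{path}/{name}"))
    return flat


def to_document(node):
    """Build a simplified metadata document (not a complete v3 zarr.json)."""
    doc = {"zarr_format": 3, "attributes": node.attributes}
    if isinstance(node, ArrayNode):
        doc.update(node_type="array", shape=list(node.shape), data_type=node.dtype)
    else:
        doc["node_type"] = "group"
    return doc


async def write_document(store, key, doc):
    """Pretend store write; the sleep stands in for network latency."""
    await asyncio.sleep(0.01)
    store[key] = json.dumps(doc).encode()


async def save(root, store):
    """Write every metadata document concurrently rather than one by one."""
    flat = flatten(root)
    await asyncio.gather(
        *(
            write_document(store, f"{path}/zarr.json".lstrip("/"), to_document(node))
            for path, node in flat.items()
        )
    )


root = GroupNode(
    attributes={"title": "demo"},
    members={
        "a": ArrayNode(shape=(10, 10), dtype="uint8"),
        "g": GroupNode(members={"b": ArrayNode(shape=(4,), dtype="float32")}),
    },
)
store = {}
asyncio.run(save(root, store))
print(sorted(store))
```

Because the models hold no reference to storage, the same `root` can be built, inspected, and modified freely, and storage is touched only in the single `save` call.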
Implementation
This PR adds a module called `hierarchy.py` that contains two classes, `ArrayModel` and `GroupModel`, which model Zarr arrays and groups, respectively. "Model" here is an important concept: `ArrayModel` has all the array metadata attributes like `shape` and `dtype`, but `ArrayModel` has no connection to storage or chunks, so you can't use `ArrayModel` to read and write array data. The same goes for `GroupModel`: it has all the static attributes of a Zarr group, but no connection to storage, so you cannot access sub-groups or sub-arrays with a `GroupModel`. (You can, however, access sub-`GroupModel` and sub-`ArrayModel` instances, but these are just models.) The classes are pretty simple, so I will just paste the current code here:

Goals
- `zarr.json` metadata documents in a large hierarchy, which should vastly speed up these interactions on high-latency storage
- `dict[str_that_obeys_path_semantics, ArrayModel | GroupModel]`. This has been useful over in `pydantic-zarr` for a variety of things, and I think it would be useful here. It could also provide a serialization format for consolidated metadata in zarr v3, which so far has not been defined.

Process
Unlike a lot of other v3 efforts, this PR adds new functionality that was never in `zarr-python` before. I'm basing the design here on work I did over in `pydantic-zarr`, so there's some prior art, but I am happy to explore and experiment as needed. It might take a while before we have an API everyone is happy with.